Overview

Dataset statistics

Number of variables: 19
Number of observations: 336776
Missing cells: 46595
Missing cells (%): 0.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 48.8 MiB
Average record size in memory: 152.0 B
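
The percentage and per-record figures above follow from the raw counts. The sketch below redoes that arithmetic, assuming the missing-cell share is taken over all cells (variables times observations) and that 1 MiB is 1024 * 1024 bytes. The variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 19
n_observations = 336_776
missing_cells = 46_595
total_size_mib = 48.8

# Share of missing cells over the whole table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, from the rounded total size.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```

Because the total size is itself rounded to one decimal, the per-record result only matches the reported 152.0 B approximately.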

Variable types

Categorical: 6
Numeric: 13

Alerts

year has constant value "2013" Constant
tailnum has a high cardinality: 4043 distinct values High cardinality
dest has a high cardinality: 105 distinct values High cardinality
time_hour has a high cardinality: 6936 distinct values High cardinality
dep_time is highly correlated with sched_dep_time and 3 other fields High correlation
sched_dep_time is highly correlated with dep_time and 3 other fields High correlation
dep_delay is highly correlated with arr_delay High correlation
arr_time is highly correlated with dep_time and 3 other fields High correlation
sched_arr_time is highly correlated with dep_time and 3 other fields High correlation
arr_delay is highly correlated with dep_delay High correlation
air_time is highly correlated with distance High correlation
distance is highly correlated with air_time High correlation
hour is highly correlated with dep_time and 3 other fields High correlation
carrier is highly correlated with year and 1 other field High correlation
year is highly correlated with carrier and 1 other field High correlation
origin is highly correlated with carrier and 1 other field High correlation
carrier is highly correlated with flight and 3 other fields High correlation
flight is highly correlated with carrier and 2 other fields High correlation
air_time is highly correlated with carrier and 2 other fields High correlation
distance is highly correlated with carrier and 1 other field High correlation
dep_time has 8255 (2.5%) missing values Missing
dep_delay has 8255 (2.5%) missing values Missing
arr_time has 8713 (2.6%) missing values Missing
arr_delay has 9430 (2.8%) missing values Missing
air_time has 9430 (2.8%) missing values Missing
dep_delay has 16514 (4.9%) zeros Zeros
arr_delay has 5409 (1.6%) zeros Zeros
minute has 60696 (18.0%) zeros Zeros
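
The Constant, High cardinality, Missing and Zeros alerts above can be reproduced from per-column counts with simple cutoffs. The sketch below is an assumption-laden illustration, not the report's actual configuration: the 1% share cutoff and the 50-value cardinality cutoff are hypothetical values chosen because they agree with the alerts shown here (for example, tailnum with 2512 missing values, 0.7%, is not flagged as Missing). The function name `alerts` is likewise made up for this example.

```python
N_OBSERVATIONS = 336_776

# Illustrative cutoffs, NOT read from the report's config.json.
SHARE_CUTOFF = 0.01        # flag Missing / Zeros above 1% of rows
CARDINALITY_CUTOFF = 50    # flag High cardinality above 50 distinct values


def alerts(distinct, missing=0, zeros=0, categorical=False):
    """Return the alert labels implied by a column's counts."""
    found = []
    if distinct == 1:
        found.append("Constant")
    if categorical and distinct > CARDINALITY_CUTOFF:
        found.append("High cardinality")
    if missing / N_OBSERVATIONS > SHARE_CUTOFF:
        found.append("Missing")
    if zeros / N_OBSERVATIONS > SHARE_CUTOFF:
        found.append("Zeros")
    return found


print("year:", alerts(distinct=1, categorical=True))
print("tailnum:", alerts(distinct=4043, missing=2512, categorical=True))
print("carrier:", alerts(distinct=16, categorical=True))
print("dep_delay:", alerts(distinct=527, missing=8255, zeros=16514))
```

High correlation alerts are left out on purpose, since they depend on pairwise correlation values that are not listed in this section.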

Reproduction

Analysis started: 2022-03-09 03:59:05.575851
Analysis finished: 2022-03-09 03:59:39.157408
Duration: 33.58 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
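
As a quick consistency check, the duration can be recomputed from the two timestamps above. This is a minimal sketch using only the standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Assumed layout of the timestamps shown in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-03-09 03:59:05.575851", FMT)
finished = datetime.strptime("2022-03-09 03:59:39.157408", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```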

Variables

year
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

2013  336776

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2013
2nd row: 2013
3rd row: 2013
4th row: 2013
5th row: 2013

Common Values

Value  Count   Frequency (%)
2013   336776  100.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
2013   336776  100.0%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

month
Real number (ℝ≥0)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.548509989
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.414457245
Coefficient of variation (CV): 0.5214097941
Kurtosis: -1.186950055
Mean: 6.548509989
Median Absolute Deviation (MAD): 3
Skewness: -0.01339988513
Sum: 2205381
Variance: 11.65851828
Monotonicity: Not monotonic
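
Several of the month statistics above are derived from others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the range and IQR are differences of quantiles. The sketch below recomputes them from the reported values; the variable names are illustrative.

```python
# Values copied from the month statistics above.
mean = 6.548509989
std = 3.414457245
q1, q3 = 4, 10
minimum, maximum = 1, 12

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance from standard deviation
iqr = q3 - q1                   # interquartile range
value_range = maximum - minimum

print(f"CV = {cv:.4f}")
print(f"variance = {variance:.4f}")
print(f"IQR = {iqr}, range = {value_range}")
```

The same relations can be used to sanity-check the other numeric variables in this report.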
Histogram with fixed size bins (bins=12)

Value             Count  Frequency (%)
7                 29425  8.7%
8                 29327  8.7%
10                28889  8.6%
3                 28834  8.6%
5                 28796  8.6%
4                 28330  8.4%
6                 28243  8.4%
12                28135  8.4%
9                 27574  8.2%
11                27268  8.1%
Other values (2)  51955  15.4%

Value  Count  Frequency (%)
1      27004  8.0%
2      24951  7.4%
3      28834  8.6%
4      28330  8.4%
5      28796  8.6%
6      28243  8.4%
7      29425  8.7%
8      29327  8.7%
9      27574  8.2%
10     28889  8.6%

Value  Count  Frequency (%)
12     28135  8.4%
11     27268  8.1%
10     28889  8.6%
9      27574  8.2%
8      29327  8.7%
7      29425  8.7%
6      28243  8.4%
5      28796  8.6%
4      28330  8.4%
3      28834  8.6%
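
The "Other values (2)" row in the first month table is the remainder after the ten most frequent values are subtracted from the total number of observations. The sketch below reproduces that row from the counts listed above; the dictionary name is illustrative.

```python
N_OBSERVATIONS = 336_776

# Ten most frequent months and their counts, from the first table above.
top_counts = {
    7: 29425, 8: 29327, 10: 28889, 3: 28834, 5: 28796,
    4: 28330, 6: 28243, 12: 28135, 9: 27574, 11: 27268,
}

other = N_OBSERVATIONS - sum(top_counts.values())
other_pct = 100 * other / N_OBSERVATIONS
print(other, f"{other_pct:.1f}%")

# Months 1 and 2 are the two values missing from the top ten;
# their counts come from the second table above.
print(other == 27004 + 24951)
```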

day
Real number (ℝ≥0)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.71078699
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
median: 16
Q3: 23
95-th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.768607102
Coefficient of variation (CV): 0.5581265347
Kurtosis: -1.185945406
Mean: 15.71078699
Median Absolute Deviation (MAD): 8
Skewness: 0.007744499321
Sum: 5291016
Variance: 76.8884705
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)

Value              Count   Frequency (%)
18                 11399   3.4%
11                 11359   3.4%
22                 11345   3.4%
15                 11317   3.4%
8                  11271   3.3%
10                 11227   3.3%
17                 11222   3.3%
3                  11211   3.3%
21                 11141   3.3%
20                 11111   3.3%
Other values (21)  224173  66.6%

Value  Count  Frequency (%)
1      11036  3.3%
2      10808  3.2%
3      11211  3.3%
4      11059  3.3%
5      10858  3.2%
6      11059  3.3%
7      10985  3.3%
8      11271  3.3%
9      10857  3.2%
10     11227  3.3%

Value  Count  Frequency (%)
31     6190   1.8%
30     10289  3.1%
29     10039  3.0%
28     10773  3.2%
27     11084  3.3%
26     10883  3.2%
25     11097  3.3%
24     11041  3.3%
23     10966  3.3%
22     11345  3.4%

dep_time
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 1318
Distinct (%): 0.4%
Missing: 8255
Missing (%): 2.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1349.109947
Minimum: 1
Maximum: 2400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 624
Q1: 907
median: 1401
Q3: 1744
95-th percentile: 2112
Maximum: 2400
Range: 2399
Interquartile range (IQR): 837

Descriptive statistics

Standard deviation: 488.281791
Coefficient of variation (CV): 0.3619288346
Kurtosis: -1.088319991
Mean: 1349.109947
Median Absolute Deviation (MAD): 428
Skewness: -0.02474345303
Sum: 443210949
Variance: 238419.1074
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
555                  834     0.2%
755                  820     0.2%
556                  818     0.2%
557                  799     0.2%
655                  798     0.2%
1455                 774     0.2%
1454                 769     0.2%
654                  751     0.2%
855                  743     0.2%
754                  742     0.2%
Other values (1308)  320673  95.2%
(Missing)            8255    2.5%

Value  Count  Frequency (%)
1      25     < 0.1%
2      35     < 0.1%
3      26     < 0.1%
4      26     < 0.1%
5      21     < 0.1%
6      22     < 0.1%
7      22     < 0.1%
8      23     < 0.1%
9      28     < 0.1%
10     22     < 0.1%

Value  Count  Frequency (%)
2400   29     < 0.1%
2359   55     < 0.1%
2358   76     < 0.1%
2357   74     < 0.1%
2356   74     < 0.1%
2355   82     < 0.1%
2354   69     < 0.1%
2353   68     < 0.1%
2352   68     < 0.1%
2351   57     < 0.1%

sched_dep_time
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1021
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1344.25484
Minimum: 106
Maximum: 2359
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 106
5-th percentile: 630
Q1: 906
median: 1359
Q3: 1729
95-th percentile: 2050
Maximum: 2359
Range: 2253
Interquartile range (IQR): 823

Descriptive statistics

Standard deviation: 467.3357557
Coefficient of variation (CV): 0.3476541366
Kurtosis: -1.197903099
Mean: 1344.25484
Median Absolute Deviation (MAD): 414
Skewness: -0.00585808289
Sum: 452712768
Variance: 218402.7086
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
600                  7016    2.1%
700                  4900    1.5%
630                  4770    1.4%
900                  4766    1.4%
1200                 4624    1.4%
1700                 4526    1.3%
1600                 4098    1.2%
800                  3926    1.2%
1300                 3689    1.1%
1900                 3653    1.1%
Other values (1011)  290808  86.4%

Value  Count  Frequency (%)
106    1      < 0.1%
500    341    0.1%
501    1      < 0.1%
505    2      < 0.1%
510    5      < 0.1%
515    208    0.1%
516    4      < 0.1%
517    28     < 0.1%
520    7      < 0.1%
525    37     < 0.1%

Value  Count  Frequency (%)
2359   828    0.2%
2358   44     < 0.1%
2355   73     < 0.1%
2352   16     < 0.1%
2345   1      < 0.1%
2339   1      < 0.1%
2330   14     < 0.1%
2315   1      < 0.1%
2305   61     < 0.1%
2300   22     < 0.1%

dep_delay
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 527
Distinct (%): 0.2%
Missing: 8255
Missing (%): 2.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.63907026
Minimum: -43
Maximum: 1301
Zeros: 16514
Zeros (%): 4.9%
Negative: 183575
Negative (%): 54.5%
Memory size: 2.6 MiB

Quantile statistics

Minimum: -43
5-th percentile: -9
Q1: -5
median: -2
Q3: 11
95-th percentile: 88
Maximum: 1301
Range: 1344
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 40.21006089
Coefficient of variation (CV): 3.181409714
Kurtosis: 43.95011603
Mean: 12.63907026
Median Absolute Deviation (MAD): 4
Skewness: 4.802540511
Sum: 4152200
Variance: 1616.848997
Monotonicity: Not monotonic
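
The dep_delay figures above show how missing values enter the summary: the reported mean equals the sum divided by the number of non-missing rows, not by all observations. The sketch below checks this and also derives the count of positive values, which the report does not list directly. Reading positive values as late departures is an assumption about the column's meaning.

```python
N_OBSERVATIONS = 336_776

# Counts and sum copied from the dep_delay statistics above.
missing = 8_255
zeros = 16_514
negative = 183_575
total = 4_152_200  # the "Sum" row

non_missing = N_OBSERVATIONS - missing
mean = total / non_missing

# Whatever is neither missing, zero, nor negative must be positive.
positive = non_missing - zeros - negative
positive_pct = 100 * positive / N_OBSERVATIONS

print(f"non-missing rows: {non_missing}")
print(f"mean over non-missing rows: {mean:.5f}")
print(f"positive values: {positive} ({positive_pct:.1f}% of all rows)")
```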
Histogram with fixed size bins (bins=50)

Value               Count   Frequency (%)
-5                  24821   7.4%
-4                  24619   7.3%
-3                  24218   7.2%
-2                  21516   6.4%
-6                  20701   6.1%
-1                  18813   5.6%
-7                  16752   5.0%
0                   16514   4.9%
-8                  11791   3.5%
1                   8050    2.4%
Other values (517)  140726  41.8%
(Missing)           8255    2.5%

Value  Count  Frequency (%)
-43    1      < 0.1%
-33    1      < 0.1%
-32    1      < 0.1%
-30    1      < 0.1%
-27    1      < 0.1%
-26    1      < 0.1%
-25    2      < 0.1%
-24    4      < 0.1%
-23    6      < 0.1%
-22    11     < 0.1%

Value  Count  Frequency (%)
1301   1      < 0.1%
1137   1      < 0.1%
1126   1      < 0.1%
1014   1      < 0.1%
1005   1      < 0.1%
960    1      < 0.1%
911    1      < 0.1%
899    1      < 0.1%
898    1      < 0.1%
896    1      < 0.1%

arr_time
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 1411
Distinct (%): 0.4%
Missing: 8713
Missing (%): 2.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1502.054999
Minimum: 1
Maximum: 2400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 736
Q1: 1104
median: 1535
Q3: 1940
95-th percentile: 2248
Maximum: 2400
Range: 2399
Interquartile range (IQR): 836

Descriptive statistics

Standard deviation: 533.264132
Coefficient of variation (CV): 0.3550230401
Kurtosis: -0.1926343839
Mean: 1502.054999
Median Absolute Deviation (MAD): 418
Skewness: -0.4678190642
Sum: 492768669
Variance: 284370.6345
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
1008                 485     0.1%
1013                 484     0.1%
1015                 479     0.1%
1012                 464     0.1%
1005                 460     0.1%
1016                 459     0.1%
1006                 459     0.1%
1011                 457     0.1%
1007                 456     0.1%
1040                 455     0.1%
Other values (1401)  323405  96.0%
(Missing)            8713    2.6%

Value  Count  Frequency (%)
1      201    0.1%
2      164    < 0.1%
3      174    0.1%
4      173    0.1%
5      206    0.1%
6      148    < 0.1%
7      170    0.1%
8      147    < 0.1%
9      140    < 0.1%
10     178    0.1%

Value  Count  Frequency (%)
2400   150    < 0.1%
2359   222    0.1%
2358   189    0.1%
2357   207    0.1%
2356   202    0.1%
2355   206    0.1%
2354   195    0.1%
2353   182    0.1%
2352   193    0.1%
2351   216    0.1%

sched_arr_time
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1163
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1536.38022
Minimum: 1
Maximum: 2359
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 815
Q1: 1124
median: 1556
Q3: 1945
95-th percentile: 2246
Maximum: 2359
Range: 2358
Interquartile range (IQR): 821

Descriptive statistics

Standard deviation: 497.4571415
Coefficient of variation (CV): 0.323785177
Kurtosis: -0.3822477902
Mean: 1536.38022
Median Absolute Deviation (MAD): 417
Skewness: -0.3531380695
Sum: 517415985
Variance: 247463.6076
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
1025                 1324    0.4%
2015                 1234    0.4%
1110                 1198    0.4%
1115                 1193    0.4%
1235                 1133    0.3%
2359                 1121    0.3%
1815                 1111    0.3%
1015                 1080    0.3%
1645                 1079    0.3%
1220                 1073    0.3%
Other values (1153)  325230  96.6%

Value  Count  Frequency (%)
1      243    0.1%
2      95     < 0.1%
3      159    < 0.1%
4      107    < 0.1%
5      82     < 0.1%
6      19     < 0.1%
7      85     < 0.1%
8      154    < 0.1%
9      55     < 0.1%
10     72     < 0.1%

Value  Count  Frequency (%)
2359   1121   0.3%
2358   483    0.1%
2357   349    0.1%
2356   468    0.1%
2355   335    0.1%
2354   384    0.1%
2353   263    0.1%
2352   47     < 0.1%
2351   140    < 0.1%
2350   105    < 0.1%

arr_delay
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 577
Distinct (%): 0.2%
Missing: 9430
Missing (%): 2.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.895376757
Minimum: -86
Maximum: 1272
Zeros: 5409
Zeros (%): 1.6%
Negative: 188933
Negative (%): 56.1%
Memory size: 2.6 MiB

Quantile statistics

Minimum: -86
5-th percentile: -32
Q1: -17
median: -5
Q3: 14
95-th percentile: 91
Maximum: 1272
Range: 1358
Interquartile range (IQR): 31

Descriptive statistics

Standard deviation: 44.63329169
Coefficient of variation (CV): 6.472930089
Kurtosis: 29.233044
Mean: 6.895376757
Median Absolute Deviation (MAD): 14
Skewness: 3.71681748
Sum: 2257174
Variance: 1992.130727
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count   Frequency (%)
-13                 7177    2.1%
-10                 7088    2.1%
-12                 7046    2.1%
-14                 6975    2.1%
-11                 6863    2.0%
-9                  6815    2.0%
-15                 6796    2.0%
-7                  6677    2.0%
-17                 6668    2.0%
-8                  6663    2.0%
Other values (567)  258578  76.8%
(Missing)           9430    2.8%

Value  Count  Frequency (%)
-86    1      < 0.1%
-79    1      < 0.1%
-75    2      < 0.1%
-74    1      < 0.1%
-73    1      < 0.1%
-71    3      < 0.1%
-70    8      < 0.1%
-69    7      < 0.1%
-68    12     < 0.1%
-67    7      < 0.1%

Value  Count  Frequency (%)
1272   1      < 0.1%
1127   1      < 0.1%
1109   1      < 0.1%
1007   1      < 0.1%
989    1      < 0.1%
931    1      < 0.1%
915    1      < 0.1%
895    1      < 0.1%
878    1      < 0.1%
875    1      < 0.1%

carrier
Categorical

HIGH CORRELATION

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

UA                 58665
B6                 54635
EV                 54173
DL                 48110
AA                 32729
Other values (11)  88464

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: UA
2nd row: UA
3rd row: AA
4th row: B6
5th row: DL

Common Values

Value             Count  Frequency (%)
UA                58665  17.4%
B6                54635  16.2%
EV                54173  16.1%
DL                48110  14.3%
AA                32729  9.7%
MQ                26397  7.8%
US                20536  6.1%
9E                18460  5.5%
WN                12275  3.6%
VX                5162   1.5%
Other values (6)  5634   1.7%

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
ua                58665  17.4%
b6                54635  16.2%
ev                54173  16.1%
dl                48110  14.3%
aa                32729  9.7%
mq                26397  7.8%
us                20536  6.1%
9e                18460  5.5%
wn                12275  3.6%
vx                5162   1.5%
Other values (6)  5634   1.7%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

flight
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3844
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1971.92362
Minimum: 1
Maximum: 8500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 91
Q1: 553
median: 1496
Q3: 3465
95-th percentile: 4695
Maximum: 8500
Range: 8499
Interquartile range (IQR): 2912

Descriptive statistics

Standard deviation: 1632.471938
Coefficient of variation (CV): 0.8278575913
Kurtosis: -0.8485606835
Mean: 1971.92362
Median Absolute Deviation (MAD): 1085
Skewness: 0.6616036349
Sum: 664096549
Variance: 2664964.629
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
15                   968     0.3%
27                   898     0.3%
181                  882     0.3%
301                  871     0.3%
161                  786     0.2%
695                  782     0.2%
1109                 716     0.2%
745                  711     0.2%
359                  709     0.2%
1                    701     0.2%
Other values (3834)  328752  97.6%

Value  Count  Frequency (%)
1      701    0.2%
2      51     < 0.1%
3      631    0.2%
4      393    0.1%
5      324    0.1%
6      210    0.1%
7      237    0.1%
8      236    0.1%
9      153    < 0.1%
10     61     < 0.1%

Value  Count  Frequency (%)
8500   1      < 0.1%
6181   80     < 0.1%
6180   6      < 0.1%
6177   164    < 0.1%
6171   1      < 0.1%
6168   2      < 0.1%
6167   3      < 0.1%
6165   1      < 0.1%
6140   1      < 0.1%
6138   2      < 0.1%

tailnum
Categorical

HIGH CARDINALITY

Distinct: 4043
Distinct (%): 1.2%
Missing: 2512
Missing (%): 0.7%
Memory size: 2.6 MiB

N725MQ               575
N722MQ               513
N723MQ               507
N711MQ               486
N713MQ               483
Other values (4038)  331700

Length

Max length: 6
Median length: 6
Mean length: 5.995222339
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 171
Unique (%): 0.1%

Sample

1st row: N14228
2nd row: N24211
3rd row: N619AA
4th row: N804JB
5th row: N668DN

Common Values

Value                Count   Frequency (%)
N725MQ               575     0.2%
N722MQ               513     0.2%
N723MQ               507     0.2%
N711MQ               486     0.1%
N713MQ               483     0.1%
N258JB               427     0.1%
N298JB               407     0.1%
N353JB               404     0.1%
N351JB               402     0.1%
N735MQ               396     0.1%
Other values (4033)  329664  97.9%
(Missing)            2512    0.7%

Length

Histogram of lengths of the category

Value                Count   Frequency (%)
n725mq               575     0.2%
n722mq               513     0.2%
n723mq               507     0.2%
n711mq               486     0.1%
n713mq               483     0.1%
n258jb               427     0.1%
n298jb               407     0.1%
n353jb               404     0.1%
n351jb               402     0.1%
n735mq               396     0.1%
Other values (4033)  329664  98.6%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

origin
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

EWR  120835
JFK  111279
LGA  104662

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: EWR
2nd row: LGA
3rd row: JFK
4th row: JFK
5th row: LGA

Common Values

Value  Count   Frequency (%)
EWR    120835  35.9%
JFK    111279  33.0%
LGA    104662  31.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
ewr    120835  35.9%
jfk    111279  33.0%
lga    104662  31.1%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

dest
Categorical

HIGH CARDINALITY

Distinct: 105
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

ORD                 17283
ATL                 17215
LAX                 16174
BOS                 15508
MCO                 14082
Other values (100)  256514

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: IAH
2nd row: IAH
3rd row: MIA
4th row: BQN
5th row: ATL

Common Values

Value              Count   Frequency (%)
ORD                17283   5.1%
ATL                17215   5.1%
LAX                16174   4.8%
BOS                15508   4.6%
MCO                14082   4.2%
CLT                14064   4.2%
SFO                13331   4.0%
FLL                12055   3.6%
MIA                11728   3.5%
DCA                9705    2.9%
Other values (95)  195631  58.1%

Length

Histogram of lengths of the category

Value              Count   Frequency (%)
ord                17283   5.1%
atl                17215   5.1%
lax                16174   4.8%
bos                15508   4.6%
mco                14082   4.2%
clt                14064   4.2%
sfo                13331   4.0%
fll                12055   3.6%
mia                11728   3.5%
dca                9705    2.9%
Other values (95)  195631  58.1%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

air_time
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 509
Distinct (%): 0.2%
Missing: 9430
Missing (%): 2.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 150.6864602
Minimum: 20
Maximum: 695
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 20
5-th percentile: 40
Q1: 82
median: 129
Q3: 192
95-th percentile: 339
Maximum: 695
Range: 675
Interquartile range (IQR): 110

Descriptive statistics

Standard deviation: 93.68830466
Coefficient of variation (CV): 0.6217433506
Kurtosis: 0.8630769908
Mean: 150.6864602
Median Absolute Deviation (MAD): 51
Skewness: 1.070705186
Sum: 49326610
Variance: 8777.49843
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count   Frequency (%)
42                  2552    0.8%
43                  2543    0.8%
41                  2513    0.7%
45                  2495    0.7%
40                  2466    0.7%
44                  2444    0.7%
39                  2411    0.7%
47                  2409    0.7%
46                  2406    0.7%
109                 2377    0.7%
Other values (499)  302730  89.9%
(Missing)           9430    2.8%

Value  Count  Frequency (%)
20     2      < 0.1%
21     14     < 0.1%
22     34     < 0.1%
23     82     < 0.1%
24     103    < 0.1%
25     124    < 0.1%
26     169    0.1%
27     147    < 0.1%
28     180    0.1%
29     209    0.1%

Value  Count  Frequency (%)
695    1      < 0.1%
691    1      < 0.1%
686    2      < 0.1%
683    1      < 0.1%
679    1      < 0.1%
676    2      < 0.1%
675    1      < 0.1%
671    2      < 0.1%
669    1      < 0.1%
667    2      < 0.1%

distance
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 214
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1039.912604
Minimum: 17
Maximum: 4983
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 17
5-th percentile: 199
Q1: 502
median: 872
Q3: 1389
95-th percentile: 2475
Maximum: 4983
Range: 4966
Interquartile range (IQR): 887

Descriptive statistics

Standard deviation: 733.2330333
Coefficient of variation (CV): 0.7050910151
Kurtosis: 1.193639906
Mean: 1039.912604
Median Absolute Deviation (MAD): 384
Skewness: 1.128690151
Sum: 350217607
Variance: 537630.6812
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count   Frequency (%)
2475                11262   3.3%
762                 10263   3.0%
733                 8857    2.6%
2586                8204    2.4%
544                 6168    1.8%
719                 6100    1.8%
187                 5898    1.8%
1096                5781    1.7%
2454                5695    1.7%
184                 5504    1.6%
Other values (204)  263044  78.1%

Value  Count  Frequency (%)
17     1      < 0.1%
80     49     < 0.1%
94     976    0.3%
96     607    0.2%
116    443    0.1%
143    439    0.1%
160    376    0.1%
169    545    0.2%
173    221    0.1%
184    5504   1.6%

Value  Count  Frequency (%)
4983   342    0.1%
4963   365    0.1%
3370   8      < 0.1%
2586   8204   2.4%
2576   312    0.1%
2569   329    0.1%
2565   5127   1.5%
2521   284    0.1%
2475   11262  3.3%
2465   1039   0.3%

hour
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.1802474
Minimum: 1
Maximum: 23
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 9
median: 13
Q3: 17
95-th percentile: 20
Maximum: 23
Range: 22
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 4.661315708
Coefficient of variation (CV): 0.3536591966
Kurtosis: -1.206416089
Mean: 13.1802474
Median Absolute Deviation (MAD): 4
Skewness: -0.0005426517817
Sum: 4438791
Variance: 21.72786413
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Value              Count   Frequency (%)
8                  27242   8.1%
6                  25951   7.7%
17                 24426   7.3%
15                 23888   7.1%
16                 23002   6.8%
7                  22821   6.8%
18                 21783   6.5%
14                 21706   6.4%
19                 21441   6.4%
9                  20312   6.0%
Other values (10)  104204  30.9%

Value  Count  Frequency (%)
1      1      < 0.1%
5      1953   0.6%
6      25951  7.7%
7      22821  6.8%
8      27242  8.1%
9      20312  6.0%
10     16708  5.0%
11     16033  4.8%
12     18181  5.4%
13     19956  5.9%

Value  Count  Frequency (%)
23     1061   0.3%
22     2639   0.8%
21     10933  3.2%
20     16739  5.0%
19     21441  6.4%
18     21783  6.5%
17     24426  7.3%
16     23002  6.8%
15     23888  7.1%
14     21706  6.4%

minute
Real number (ℝ≥0)

ZEROS

Distinct: 60
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.23009953
Minimum: 0
Maximum: 59
Zeros: 60696
Zeros (%): 18.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 8
median: 29
Q3: 44
95-th percentile: 58
Maximum: 59
Range: 59
Interquartile range (IQR): 36

Descriptive statistics

Standard deviation: 19.30084566
Coefficient of variation (CV): 0.7358281517
Kurtosis: -1.235018012
Mean: 26.23009953
Median Absolute Deviation (MAD): 16
Skewness: 0.09293094675
Sum: 8833668
Variance: 372.5226431
Monotonicity: Not monotonic
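
The Sum rows of hour, minute and sched_dep_time suggest why the alerts flag hour as highly correlated with the time columns. The sketch below checks whether the reported sums are consistent with sched_dep_time being encoded as 100 * hour + minute. That encoding is an assumption: matching sums are compatible with it but do not prove it holds row by row.

```python
N_OBSERVATIONS = 336_776

# "Sum" values copied from the hour, minute and sched_dep_time sections.
sum_hour = 4_438_791
sum_minute = 8_833_668
sum_sched_dep_time = 452_712_768

# If sched_dep_time == 100 * hour + minute on every row,
# the column sums must satisfy the same relation.
reconstructed = 100 * sum_hour + sum_minute
sums_agree = reconstructed == sum_sched_dep_time

print(f"reconstructed sum: {reconstructed}")
print(f"sums agree: {sums_agree}")
print(f"implied mean: {reconstructed / N_OBSERVATIONS:.3f}")
```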
Histogram with fixed size bins (bins=50)

Value              Count   Frequency (%)
0                  60696   18.0%
30                 33899   10.1%
45                 20398   6.1%
15                 18868   5.6%
55                 18834   5.6%
59                 16288   4.8%
10                 14503   4.3%
25                 14450   4.3%
5                  14118   4.2%
29                 13823   4.1%
Other values (50)  110899  32.9%

Value  Count  Frequency (%)
0      60696  18.0%
1      2116   0.6%
2      848    0.3%
3      1439   0.4%
4      1357   0.4%
5      14118  4.2%
6      1381   0.4%
7      1092   0.3%
8      1695   0.5%
9      1445   0.4%

Value  Count  Frequency (%)
59     16288  4.8%
58     1065   0.3%
57     1388   0.4%
56     1713   0.5%
55     18834  5.6%
54     1405   0.4%
53     1382   0.4%
52     1281   0.4%
51     1184   0.4%
50     12508  3.7%

time_hour
Categorical

HIGH CARDINALITY

Distinct: 6936
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

2013-09-13T12:00:00Z  94
2013-09-20T12:00:00Z  94
2013-09-09T12:00:00Z  93
2013-09-16T12:00:00Z  93
2013-09-23T12:00:00Z  93
Other values (6931)   336309

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 52
Unique (%): < 0.1%

Sample

1st row: 2013-01-01T10:00:00Z
2nd row: 2013-01-01T10:00:00Z
3rd row: 2013-01-01T10:00:00Z
4th row: 2013-01-01T10:00:00Z
5th row: 2013-01-01T11:00:00Z

Common Values

Value                 Count   Frequency (%)
2013-09-13T12:00:00Z  94      < 0.1%
2013-09-20T12:00:00Z  94      < 0.1%
2013-09-09T12:00:00Z  93      < 0.1%
2013-09-16T12:00:00Z  93      < 0.1%
2013-09-23T12:00:00Z  93      < 0.1%
2013-09-19T12:00:00Z  92      < 0.1%
2013-10-11T12:00:00Z  92      < 0.1%
2013-09-10T12:00:00Z  91      < 0.1%
2013-10-09T12:00:00Z  91      < 0.1%
2013-09-12T12:00:00Z  91      < 0.1%
Other values (6926)   335852  99.7%

Length

Histogram of lengths of the category

Value                 Count   Frequency (%)
2013-09-13t12:00:00z  94      < 0.1%
2013-09-20t12:00:00z  94      < 0.1%
2013-09-09t12:00:00z  93      < 0.1%
2013-09-23t12:00:00z  93      < 0.1%
2013-09-16t12:00:00z  93      < 0.1%
2013-09-19t12:00:00z  92      < 0.1%
2013-10-11t12:00:00z  92      < 0.1%
2013-09-24t12:00:00z  91      < 0.1%
2013-10-01t12:00:00z  91      < 0.1%
2013-09-18t12:00:00z  91      < 0.1%
Other values (6926)   335852  99.7%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

Interactions

[Interaction plots: 169 plot panels]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
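
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report generator runs: the helper names average_ranks and spearman_rho are hypothetical, ties receive the mean of their rank positions (an assumption about tie handling), and the input values are made up.

```python
from statistics import mean, pstdev


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their
    standard deviations (population versions of both)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = mean([(a - mx) * (b - my) for a, b in zip(rx, ry)])
    return cov / (pstdev(rx) * pstdev(ry))


# Made-up data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)  # 1 up to floating-point rounding
```

Because only the ranks enter the formula, any strictly increasing transformation of x or y leaves ρ unchanged.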

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
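
As a rough sketch of this formula and of the invariance property, the snippet below computes r in plain Python. The function name pearson_r is hypothetical, the five value pairs are the dep_delay and arr_delay entries of the first rows in the Sample section, and the shift and scale factors are arbitrary. Positive scale factors are assumed; a negative factor would flip the sign of r.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard
    deviations (population versions of both)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


dep_delay = [2.0, 4.0, 2.0, -1.0, -6.0]
arr_delay = [11.0, 20.0, 33.0, -18.0, -25.0]

r = pearson_r(dep_delay, arr_delay)

# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([60 * d + 5 for d in dep_delay],
                       [0.5 * a - 3 for a in arr_delay])

# Two exact linear relationships with very different slopes.
steep = pearson_r(dep_delay, [10 * d for d in dep_delay])
shallow = pearson_r(dep_delay, [0.1 * d for d in dep_delay])

print(r, r_rescaled, steep, shallow)
```

The first two printed values agree up to rounding, and both exact linear relationships give r close to 1 regardless of slope.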

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
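
The pair-counting definition can be sketched as follows. This is the plain variant often called tau-a, which makes no adjustment for ties; whether the report uses this variant or a tie-adjusted one is not stated here, so treat that as an assumption. The function name kendall_tau_a and the input values are illustrative only.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without any
    tie adjustment. Tied pairs count as neither concordant nor discordant."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair oppositely
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Made-up data: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)  # (5 - 1) / 6
```

The loop is quadratic in the number of observations, which is fine for illustration but too slow for a column with hundreds of thousands of rows.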

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
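
A minimal sketch of a bias-corrected Cramér's V, following the correction commonly attributed to Bergsma (2013), is shown below. The function name cramers_v_corrected and the contingency tables are hypothetical, and the sketch assumes every row and column of the table has a non-zero total.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list
    of rows of observed counts. Assumes no empty row or column."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Hypothetical counts: rows and columns are two nominal variables.
independent = cramers_v_corrected([[10, 20], [20, 40]])
associated = cramers_v_corrected([[50, 0], [0, 50]])
print(independent, associated)
```

For the independent table the corrected value is clipped to 0, and for the perfectly associated table it comes out at 1 up to rounding.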

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
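
One way to read the nullity correlation in the heatmap is as the Pearson correlation between present/missing indicators of two columns. The sketch below illustrates that reading under stated assumptions: the function name nullity_correlation is hypothetical, None stands in for NaN, and the three short columns are taken from rows 336768 to 336772 of the Sample section. It is undefined for a column that is entirely present or entirely missing.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns
    (1 = value present, 0 = value missing)."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    pa, pb = sum(a) / n, sum(b) / n
    cov = sum((x - pa) * (y - pb) for x, y in zip(a, b)) / n
    # The variance of a 0/1 indicator with mean p is p * (1 - p).
    return cov / sqrt(pa * (1 - pa) * pb * (1 - pb))


dep_time = [2307.0, 2349.0, None, None, None]
air_time = [33.0, 196.0, None, None, None]
tailnum = ["N565JB", "N516JB", "N740EV", None, None]

same_pattern = nullity_correlation(dep_time, air_time)
partial = nullity_correlation(dep_time, tailnum)
print(same_pattern, partial)
```

In these five rows dep_time and air_time are missing together, so their nullity correlation is 1 up to rounding, while tailnum is present in one row where dep_time is not, which lowers the value.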

Sample

First rows

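index | year | month | day | dep_time | sched_dep_time | dep_delay | arr_time | sched_arr_time | arr_delay | carrier | flight | tailnum | origin | dest | air_time | distance | hour | minute | time_hour
0 | 2013 | 1 | 1 | 517.0 | 515 | 2.0 | 830.0 | 819 | 11.0 | UA | 1545 | N14228 | EWR | IAH | 227.0 | 1400 | 5 | 15 | 2013-01-01T10:00:00Z
1 | 2013 | 1 | 1 | 533.0 | 529 | 4.0 | 850.0 | 830 | 20.0 | UA | 1714 | N24211 | LGA | IAH | 227.0 | 1416 | 5 | 29 | 2013-01-01T10:00:00Z
2 | 2013 | 1 | 1 | 542.0 | 540 | 2.0 | 923.0 | 850 | 33.0 | AA | 1141 | N619AA | JFK | MIA | 160.0 | 1089 | 5 | 40 | 2013-01-01T10:00:00Z
3 | 2013 | 1 | 1 | 544.0 | 545 | -1.0 | 1004.0 | 1022 | -18.0 | B6 | 725 | N804JB | JFK | BQN | 183.0 | 1576 | 5 | 45 | 2013-01-01T10:00:00Z
4 | 2013 | 1 | 1 | 554.0 | 600 | -6.0 | 812.0 | 837 | -25.0 | DL | 461 | N668DN | LGA | ATL | 116.0 | 762 | 6 | 0 | 2013-01-01T11:00:00Z
5 | 2013 | 1 | 1 | 554.0 | 558 | -4.0 | 740.0 | 728 | 12.0 | UA | 1696 | N39463 | EWR | ORD | 150.0 | 719 | 5 | 58 | 2013-01-01T10:00:00Z
6 | 2013 | 1 | 1 | 555.0 | 600 | -5.0 | 913.0 | 854 | 19.0 | B6 | 507 | N516JB | EWR | FLL | 158.0 | 1065 | 6 | 0 | 2013-01-01T11:00:00Z
7 | 2013 | 1 | 1 | 557.0 | 600 | -3.0 | 709.0 | 723 | -14.0 | EV | 5708 | N829AS | LGA | IAD | 53.0 | 229 | 6 | 0 | 2013-01-01T11:00:00Z
8 | 2013 | 1 | 1 | 557.0 | 600 | -3.0 | 838.0 | 846 | -8.0 | B6 | 79 | N593JB | JFK | MCO | 140.0 | 944 | 6 | 0 | 2013-01-01T11:00:00Z
9 | 2013 | 1 | 1 | 558.0 | 600 | -2.0 | 753.0 | 745 | 8.0 | AA | 301 | N3ALAA | LGA | ORD | 138.0 | 733 | 6 | 0 | 2013-01-01T11:00:00Z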

Last rows

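index | year | month | day | dep_time | sched_dep_time | dep_delay | arr_time | sched_arr_time | arr_delay | carrier | flight | tailnum | origin | dest | air_time | distance | hour | minute | time_hour
336766 | 2013 | 9 | 30 | 2240.0 | 2250 | -10.0 | 2347.0 | 7 | -20.0 | B6 | 2002 | N281JB | JFK | BUF | 52.0 | 301 | 22 | 50 | 2013-10-01T02:00:00Z
336767 | 2013 | 9 | 30 | 2241.0 | 2246 | -5.0 | 2345.0 | 1 | -16.0 | B6 | 486 | N346JB | JFK | ROC | 47.0 | 264 | 22 | 46 | 2013-10-01T02:00:00Z
336768 | 2013 | 9 | 30 | 2307.0 | 2255 | 12.0 | 2359.0 | 2358 | 1.0 | B6 | 718 | N565JB | JFK | BOS | 33.0 | 187 | 22 | 55 | 2013-10-01T02:00:00Z
336769 | 2013 | 9 | 30 | 2349.0 | 2359 | -10.0 | 325.0 | 350 | -25.0 | B6 | 745 | N516JB | JFK | PSE | 196.0 | 1617 | 23 | 59 | 2013-10-01T03:00:00Z
336770 | 2013 | 9 | 30 | NaN | 1842 | NaN | NaN | 2019 | NaN | EV | 5274 | N740EV | LGA | BNA | NaN | 764 | 18 | 42 | 2013-09-30T22:00:00Z
336771 | 2013 | 9 | 30 | NaN | 1455 | NaN | NaN | 1634 | NaN | 9E | 3393 | NaN | JFK | DCA | NaN | 213 | 14 | 55 | 2013-09-30T18:00:00Z
336772 | 2013 | 9 | 30 | NaN | 2200 | NaN | NaN | 2312 | NaN | 9E | 3525 | NaN | LGA | SYR | NaN | 198 | 22 | 0 | 2013-10-01T02:00:00Z
336773 | 2013 | 9 | 30 | NaN | 1210 | NaN | NaN | 1330 | NaN | MQ | 3461 | N535MQ | LGA | BNA | NaN | 764 | 12 | 10 | 2013-09-30T16:00:00Z
336774 | 2013 | 9 | 30 | NaN | 1159 | NaN | NaN | 1344 | NaN | MQ | 3572 | N511MQ | LGA | CLE | NaN | 419 | 11 | 59 | 2013-09-30T15:00:00Z
336775 | 2013 | 9 | 30 | NaN | 840 | NaN | NaN | 1020 | NaN | MQ | 3531 | N839MQ | LGA | RDU | NaN | 431 | 8 | 40 | 2013-09-30T12:00:00Z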